Efficient Sparse Matrix-Matrix Multiplication on Multicore Architectures∗
نویسندگان
چکیده
We describe a new parallel sparse matrix-matrix multiplication algorithm in shared memory using a quadtree decomposition. Our preliminary implementation is nearly as fast as the best sequential method on one core, and scales well to multiple cores.
منابع مشابه
cient Sparse Matrix - Matrix Multiplication on Multicore Architectures ⇤
We describe a new parallel sparse matrix-matrix multiplication algorithm in shared memory using a quadtree decomposition. Our implementation is nearly as fast as the best sequential method on one core, and scales quite well to multiple cores.
متن کاملEfficient Multicore Sparse Matrix-Vector Multiplication for Finite Element Electromagnetics on the Cell-BE processor
Multicore systems are rapidly becoming a dominant industry trend for accelerating electromagnetics computations, driving researchers to address parallel programming paradigms early in application development. We present a new sparse representation and a two level partitioning scheme for efficient sparse matrix-vector multiplication on multicore systems, and show results for a set of finite elem...
متن کاملParallel MLEM on Multicore Architectures
The efficient use of multicore architectures for sparse matrixvector multiplication (SpMV) is currently an open challenge. One algorithm which makes use of SpMV is the maximum likelihood expectation maximization (MLEM) algorithm. When using MLEM for positron emission tomography (PET) image reconstruction, one requires a particularly large matrix. We present a new storage scheme for this type of...
متن کاملpOSKI: An Extensible Autotuning Framework to Perform Optimized SpMVs on Multicore Architectures
We have developed pOSKI: the Parallel Optimized Sparse Kernel Interface – an autotuning framework to optimize Sparse Matrix Vector Multiply (SpMV) performance on emerging shared memory multicore architectures. Our autotuning methodology extends previous work done in the scientific computing community targeting serial architectures. In addition to previously explored parallel optimizations, we f...
متن کاملParallel Implementation of Interval Matrix Multiplication
Two main and not necessarily compatible objectives when implementing the product of two dense matrices with interval coefficients are accuracy and efficiency. In this work, we focus on an implementation on multicore architectures. One direction successfully explored to gain performance in execution time is the representation of intervals by their midpoints and radii rather than the classical re...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014